Arcadia Alignment

mentions: 1; type: Person

// recent coverage (1 mention)

2026-06-18 16:18 · lesswrong.com · ai-safety

Your Model Organisms Might Be Fried

Arcadia Alignment's research reveals that current AI model organisms used to study alignment pathologies suffer from degraded coherence, instruction-following, and reasoning, making them poor proxies …

// co-occurs with top 2 entities

Emergent Misalignment (1), MMLU (1)